Objectives of the Data Preparation View

The Data Preparation View supports the sharing and reuse of prepared data assets and raises data awareness among analytics users. It also eases data understanding by giving data engineers, who prepare the datasets, a reference for data preparation activities.

Understanding the elements of the Data Preparation View Model

The main modeling elements of the Data Preparation View are Entities, Relationships, Preparation Tasks, Operators, and Data Flows. The table below describes what each element represents and how it is applied.

Type of Element | Description
Entities and Relationships | Entities and their Relationships represent the raw data tables and their conceptual relationships. They also represent prepared datasets, which are the eventual output of data preparation activities. The prepared datasets are connected to their corresponding Analytics Goals via the "is required for" link.
Preparation Tasks | Data Preparation Tasks represent the general task of preparing data to accomplish one or more Analytics Goals. Data Cleaning, Data Reduction, Data Transformation, and Data Integration are types of data preparation tasks.
Operators and Data Flows | A Data Preparation Task consists of one or more Operators that are linked via Data Flows. This view is connected to the previous Analytics Design View through the "is required for" links.
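
The elements above can be sketched as simple data structures. The following is a minimal illustration only, not part of the methodology itself: the class names, fields, and the hospital-style example values (such as "Admissions" and "Predict readmission") are assumptions made for this sketch. It requires Python 3.9 or later.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskType(Enum):
    """The four types of data preparation task named in the table."""
    CLEANING = "Data Cleaning"
    REDUCTION = "Data Reduction"
    TRANSFORMATION = "Data Transformation"
    INTEGRATION = "Data Integration"


@dataclass(frozen=True)
class Entity:
    name: str
    prepared: bool = False  # False: raw data table, True: prepared dataset


@dataclass(frozen=True)
class Operator:
    name: str


@dataclass(frozen=True)
class DataFlow:
    source: str  # name of the Entity or Operator the data comes from
    target: str  # name of the Entity or Operator the data goes to


@dataclass
class PreparationTask:
    name: str
    task_type: TaskType
    operators: list[Operator] = field(default_factory=list)
    flows: list[DataFlow] = field(default_factory=list)

    def is_connected(self) -> bool:
        """Return True if every operator appears in at least one data flow."""
        linked = {f.source for f in self.flows} | {f.target for f in self.flows}
        return all(op.name in linked for op in self.operators)


@dataclass(frozen=True)
class IsRequiredFor:
    """The 'is required for' link from a prepared dataset to an Analytics Goal."""
    prepared_dataset: Entity
    analytics_goal: str


# Hypothetical example values, chosen only for illustration.
admissions = Entity("Admissions")
prepared_admissions = Entity("Prepared Admissions", prepared=True)
task = PreparationTask(
    name="Clean admissions",
    task_type=TaskType.CLEANING,
    operators=[Operator("Remove duplicates"), Operator("Fill missing values")],
    flows=[
        DataFlow("Admissions", "Remove duplicates"),
        DataFlow("Remove duplicates", "Fill missing values"),
        DataFlow("Fill missing values", "Prepared Admissions"),
    ],
)
link = IsRequiredFor(prepared_admissions, "Predict readmission")
```

Here `is_connected` is a small consistency check suggested by the definition that Operators are linked via Data Flows; a real modeling tool may validate models differently.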


Constructing the Data Preparation View Model

The following links give a step-by-step methodology for constructing Data Preparation View models:

Step 1: Understand the kinds of data needed for delivering the results
Step 2: Define the Prepared Dataset and Attributes on which Algorithms would be executed
Step 3: Decide and design the flow of Data Preparation Tasks that transform the input data into prepared datasets
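
The three steps can be illustrated with a small sketch. It is a hedged example under stated assumptions: the records, attribute names, the 7-day threshold, and the operator functions are all hypothetical and are not taken from the methodology. Each function plays the role of an Operator, and the ordered list passed to `run_flow` stands in for the Data Flows between them.

```python
# Step 1: the kind of input data needed (hypothetical raw records).
raw_visits = [
    {"patient_id": 1, "age": 70, "length_of_stay": 5, "ward": "A"},
    {"patient_id": 1, "age": 70, "length_of_stay": 5, "ward": "A"},  # duplicate
    {"patient_id": 2, "age": None, "length_of_stay": 2, "ward": "B"},
    {"patient_id": 3, "age": 45, "length_of_stay": 9, "ward": "A"},
]

# Step 2: the attributes of the prepared dataset (assumed for this sketch).
PREPARED_ATTRIBUTES = ("patient_id", "age", "long_stay")


# Step 3: operators that transform the input data into the prepared dataset.
def drop_duplicates(rows):
    """Data Cleaning: keep only the first occurrence of identical records."""
    seen, out = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            out.append(row)
    return out


def drop_missing_age(rows):
    """Data Cleaning: discard records with no recorded age."""
    return [r for r in rows if r["age"] is not None]


def add_long_stay(rows):
    """Data Transformation: derive a flag for stays longer than 7 days."""
    return [{**r, "long_stay": r["length_of_stay"] > 7} for r in rows]


def select_attributes(rows):
    """Data Reduction: keep only the attributes of the prepared dataset."""
    return [{k: r[k] for k in PREPARED_ATTRIBUTES} for r in rows]


def run_flow(rows, operators):
    """Pass the data through each operator in order."""
    for operator in operators:
        rows = operator(rows)
    return rows


prepared = run_flow(
    raw_visits,
    [drop_duplicates, drop_missing_age, add_long_stay, select_attributes],
)
print(prepared)
# [{'patient_id': 1, 'age': 70, 'long_stay': False},
#  {'patient_id': 3, 'age': 45, 'long_stay': True}]
```

Keeping each operator as a separate function mirrors the view's separation of Operators from the flow that connects them, so the order of preparation steps can be changed without rewriting the steps themselves.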

Example of Data Preparation View Modelling in Practice


Below is an example of a Data Preparation View Model. Using the same hospital setting as the Analytics Design View example, it shows how a sample set of input datasets is transformed into the prepared datasets.

Data Preparation View - Healthcare